Former quant
Statistical volatility arbitrage
Options on equities and equity indexes
What is ML?
Survey of ML
Miscellanea
Different meanings
Machine Learning
Artificial Intelligence
Statistical Learning
Applied Statistics
Historical context important
ML primarily from CS / EE
‘Engineering’ mentality
Tabular based
## default student balance income
## 1 No No 729.526 44361.63
## 2 No Yes 817.180 12106.13
## 3 No No 1073.549 31767.14
## 4 No No 529.251 35704.49
## 5 No No 785.656 38463.50
## 6 No Yes 919.589 7491.56
## 7 No No 825.513 24905.23
## 8 No Yes 808.668 17600.45
## 9 No No 1161.058 37468.53
## 10 No No 0.000 29275.27
## 11 No Yes 0.000 21871.07
## 12 No Yes 1220.584 13268.56
## 13 No No 237.045 28251.70
## 14 No No 606.742 44994.56
## 15 No No 1112.968 23810.17
Exchangeability
Predictive accuracy
De-emphasises inference / uncertainty / explainability
Discoverability of model parameters
Example of linear models
Scaling issues
Automated ML pipelines
Software engineering
Training-test split
\(k\)-fold
Train-validation-test split
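The three validation schemes listed above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the helper names `train_test_split` and `k_fold_indices`, the 80/20 proportions, and the choice of 5 folds are assumptions made for the example, not taken from the slides.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train_idx, held_out_idx) pairs; each row is held out exactly once."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, held_out

rows = list(range(100))  # stand-in for 100 rows of a table

# Training-test split: hold out 20% for the final assessment.
train, test = train_test_split(rows, test_fraction=0.2)

# k-fold: rotate the held-out fold within the training data.
folds = list(k_fold_indices(len(train), k=5))

# Train-validation-test split: split the training part once more.
fit_part, validation = train_test_split(train, test_fraction=0.25, seed=1)
```

The test rows are set aside first and never touched by the k-fold loop or the validation split, which is what keeps the final accuracy estimate honest.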
Labelled data
\[
\begin{eqnarray*}
\text{Discrete output} &\rightarrow& \text{Classification} \\
\text{Continuous output} &\rightarrow& \text{Regression}
\end{eqnarray*}
\]
\[ y = \beta_0 + \beta_1 \phi_1(X_1) + \ldots + \beta_n \phi_n(X_n) + \epsilon \]
Linear in parameters \(\beta\)
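"Linear in parameters" means the transforms \(\phi_i\) can be non-linear while the fit is still ordinary least squares. The sketch below illustrates this under assumed transforms (\(\phi_1 = \log\), \(\phi_2\) = squaring) and assumed noise-free synthetic data with true coefficients 1, 2, 3; none of these choices come from the slides.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_least_squares(design, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

# Assumed transforms for illustration only.
phi_1 = math.log
phi_2 = lambda v: v ** 2

x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5]
y = [1.0 + 2.0 * phi_1(a) + 3.0 * phi_2(b) for a, b in zip(x1, x2)]

# The design matrix holds the transformed inputs; the model is linear in beta.
design = [[1.0, phi_1(a), phi_2(b)] for a, b in zip(x1, x2)]
beta = fit_least_squares(design, y)
```

Because the data are generated without noise, `beta` recovers the assumed coefficients up to floating-point error.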
Simple to understand
Highly explainable
Prone to overfitting
Ensemble of trees
Aggregate low-bias trees to reduce variance
Bootstrap sample of rows, random subset of features at each split
Self-tuning (mostly)
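The bullets above read like a description of a random-forest-style ensemble. A minimal sketch of that idea, assuming regression trees grown without pruning, one candidate feature per split, 25 trees, and synthetic data (all illustrative choices, not the speaker's setup):

```python
import random

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, targets, n_candidates, rng):
    """Grow an unpruned regression tree (low bias, high variance)."""
    if len(set(targets)) == 1:
        return targets[0]                      # pure leaf
    best = None
    # "Constrain splits": only a random subset of features is considered here.
    for f in rng.sample(range(len(rows[0])), n_candidates):
        for threshold in sorted({r[f] for r in rows})[1:]:
            left = [t for r, t in zip(rows, targets) if r[f] < threshold]
            right = [t for r, t in zip(rows, targets) if r[f] >= threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    if best is None:                           # chosen features cannot separate the rows
        return mean(targets)
    _, f, threshold = best
    go_left = [r[f] < threshold for r in rows]
    left = build_tree([r for r, g in zip(rows, go_left) if g],
                      [t for t, g in zip(targets, go_left) if g], n_candidates, rng)
    right = build_tree([r for r, g in zip(rows, go_left) if not g],
                       [t for t, g in zip(targets, go_left) if not g], n_candidates, rng)
    return (f, threshold, left, right)

def predict_tree(tree, row):
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] < threshold else right
    return tree

def fit_forest(rows, targets, n_trees=25, n_candidates=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]   # bootstrap sample of rows
        forest.append(build_tree([rows[i] for i in sample],
                                 [targets[i] for i in sample], n_candidates, rng))
    return forest

def predict_forest(forest, row):
    """Average the individual trees; averaging is what reduces the variance."""
    return mean([predict_tree(tree, row) for tree in forest])

# Synthetic data: the third feature is pure noise.
data_rng = random.Random(1)
rows = [[data_rng.random() for _ in range(3)] for _ in range(60)]
targets = [r[0] + 2.0 * r[1] + data_rng.gauss(0.0, 0.1) for r in rows]

forest = fit_forest(rows, targets)
fitted = [predict_forest(forest, r) for r in rows]
```

The only settings here are the number of trees and the number of candidate features per split, which is roughly why such ensembles need little tuning.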
Ensemble of trees
Aggregate low-variance trees to reduce bias
Often the best-performing approach on tabular data
Tuning more involved
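Boosting builds the ensemble sequentially: each weak, low-variance learner is fitted to what the current ensemble still gets wrong. The sketch below assumes squared-error loss, single-split stumps as the weak learners, and a deterministic one-dimensional toy target; `n_rounds` and `learning_rate` stand in for the tuning parameters that make this approach more involved.

```python
import math

def fit_stump(xs, residuals):
    """Least-squares stump: one threshold, a constant prediction on each side."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        score = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or score < best[0]:
            best = (score, threshold, left_mean, right_mean)
    return best[1:]

def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared error: each stump is fitted to the residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps, history = [], []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        history.append(sum(r * r for r in residuals) / len(ys))  # MSE before this round
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [p + learning_rate * (left_mean if x < threshold else right_mean)
                       for x, p in zip(xs, predictions)]
    return base, stumps, history

def predict(base, stumps, x, learning_rate=0.1):
    return base + learning_rate * sum(l if x < t else r for t, l, r in stumps)

xs = [i / 20.0 for i in range(40)]
ys = [math.sin(3.0 * x) for x in xs]
base, stumps, history = boost(xs, ys)
```

Each stump alone is a poor model (high bias), but `history` shows the training error falling as stumps accumulate.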
Uses kernel functions
Avoids explicit co-ordinate transforms
Geometric method
Divides ‘feature space’ into regions
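The point about kernel functions can be checked numerically: a kernel returns the inner product that an explicit co-ordinate transform would give, without ever computing the transform. The sketch assumes a homogeneous degree-2 polynomial kernel on two-dimensional points, plus an RBF kernel with an arbitrary `gamma` as a second example; both are illustrative choices.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def explicit_features(v):
    """Degree-2 feature map for a 2-D point: (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def polynomial_kernel(a, b):
    """Homogeneous degree-2 polynomial kernel: (a . b)^2."""
    return dot(a, b) ** 2

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel; its implied feature space is infinite-dimensional."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

a, b = (1.0, 2.0), (3.0, -1.0)
via_kernel = polynomial_kernel(a, b)                              # works in 2-D
via_transform = dot(explicit_features(a), explicit_features(b))   # works in 3-D
```

Both routes give the same number, but the kernel never leaves the original two dimensions. For the RBF kernel the explicit route is not available at all.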
Unlabelled data
Many variables (sometimes thousands)
Correlated / dependent / useless
Reduce dimensionality while losing as little information as possible
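One common way to do this is principal component analysis (PCA), named here as an assumption since the slides do not say which method was shown. The sketch finds the leading component by power iteration on the covariance matrix; the three-variable data set is synthetic, built so that the second variable is almost a multiple of the first and the third is nearly constant.

```python
import math

def covariance_matrix(rows):
    """Return (covariance matrix, centred rows)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    return cov, centred

def leading_component(cov, n_iterations=200):
    """Power iteration: repeatedly apply cov and renormalise."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

# Synthetic data: three correlated / nearly useless variables, one real degree of freedom.
rows = []
for i in range(50):
    t = i / 10.0
    rows.append((t, 2.0 * t + 0.01 * math.sin(i), 0.01 * math.cos(3 * i)))

cov, centred = covariance_matrix(rows)
component, eigenvalue = leading_component(cov)

# Share of total variance kept by the single component.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

# One score per row replaces the original three columns.
scores = [sum(c[j] * component[j] for j in range(len(component))) for c in centred]
```

Here one column of scores retains nearly all of the variance of the three original columns, because the data were constructed that way. Real data sets rarely compress this cleanly.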
Unsupervised (clustering)
Topic modelling
Lots of functionality
Words as vectors
Semantic meaning
\[
\text{King} - \text{Male} + \text{Female} \approx \text{Queen}
\]
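The arithmetic above can be reproduced with toy vectors. The three-dimensional embeddings below are hand-made and hypothetical (axes loosely meaning royalty, maleness, femaleness); real word vectors are learned from text and have hundreds of dimensions, so the match there is approximate rather than exact.

```python
import math

# Hypothetical hand-made embeddings, for illustration only.
vectors = {
    "king":   (0.9, 0.9, 0.1),
    "queen":  (0.9, 0.1, 0.9),
    "male":   (0.1, 0.9, 0.1),
    "female": (0.1, 0.1, 0.9),
    "apple":  (0.05, 0.5, 0.5),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    return dot / (math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b)))

def analogy(positive, negative, vectors):
    """Return the word closest to sum(positive) - sum(negative), excluding the inputs."""
    dims = len(next(iter(vectors.values())))
    target = [sum(vectors[w][d] for w in positive) - sum(vectors[w][d] for w in negative)
              for d in range(dims)]
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# King - Male + Female
result = analogy(positive=["king", "female"], negative=["male"], vectors=vectors)
```

Excluding the query words from the candidates is the usual convention for this kind of lookup; without it the nearest vector is often one of the inputs.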
Thank You!!!